Smart Paper Notes

StuArt: Individualized Classroom Observation of Students with Automatic Behavior Recognition and Tracking

Huayi Zhou , Fei Jiang , Jiaxin Si , Lili Xiong , Hongtao Lu

Category: Computer Vision

2022-11-06

Each student matters, but it is hard for instructors to observe every student during a course and to help those who need it immediately. In this paper, we present StuArt, a novel automatic system designed for individualized classroom observation, which enables instructors to attend to the learning status of each student. StuArt can recognize five representative student behaviors (hand-raising, standing, sleeping, yawning, and smiling) that are highly related to engagement, and it tracks their variation trends over the course. To protect student privacy, all variation trends are indexed by seat number, without any personally identifying information. Furthermore, StuArt adopts various user-friendly visualization designs to help instructors quickly understand both individual and whole-class learning status. Experimental results on real classroom videos demonstrate the superiority and robustness of the embedded algorithms. We expect our system to promote the development of large-scale individualized guidance of students.

translated by Google Translate

LiDAR Distillation: Bridging the Beam-Induced Domain Gap for 3D Object Detection

Yi Wei , Zibu Wei , Yongming Rao , Jiaxin Li , Jie Zhou , Jiwen Lu

Categories: Computer Vision | Robotics

2022-03-28

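In this paper, we propose LiDAR Distillation to bridge the domain gap that different LiDAR beam counts induce in 3D object detection. In many real-world applications, the LiDAR sensors used by mass-produced robots and vehicles have fewer beams than those used in large-scale public datasets. Moreover, as LiDARs are upgraded to other product models with different beam counts, it becomes challenging to use the labeled data captured by earlier high-resolution sensors. Despite recent progress on domain-adaptive 3D detection, most methods struggle to eliminate the beam-induced domain gap. We find that it is essential to align the point cloud density of the source domain with that of the target domain during training. Inspired by this finding, we propose a progressive framework to mitigate the beam-induced domain shift. In each iteration, we first generate low-beam pseudo-LiDAR by downsampling the high-beam point clouds. A teacher-student framework is then used to distill rich information from the data with more beams. Extensive experiments on the Waymo, nuScenes, and KITTI datasets with three different LiDAR-based detectors demonstrate the effectiveness of LiDAR Distillation. Notably, our approach adds no extra computation cost at inference time.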
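
The beam downsampling step mentioned in the abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes beams are evenly spaced in elevation angle, and the function names, the field-of-view values, and the synthetic scan are all hypothetical.

```python
import math


def assign_beams(points, num_beams, fov_down_deg, fov_up_deg):
    """Assign each (x, y, z) point a beam index from its elevation angle.

    Assumption: beams are evenly spaced between fov_down_deg and fov_up_deg.
    """
    span = fov_up_deg - fov_down_deg
    beam_ids = []
    for x, y, z in points:
        elevation = math.degrees(math.atan2(z, math.hypot(x, y)))
        # Map the elevation linearly onto [0, num_beams - 1], then clamp.
        frac = (elevation - fov_down_deg) / span
        beam = int(round(frac * (num_beams - 1)))
        beam_ids.append(min(max(beam, 0), num_beams - 1))
    return beam_ids


def downsample_beams(points, num_beams, keep_every, fov_down_deg, fov_up_deg):
    """Keep only the points whose beam index is a multiple of keep_every."""
    beam_ids = assign_beams(points, num_beams, fov_down_deg, fov_up_deg)
    return [p for p, b in zip(points, beam_ids) if b % keep_every == 0]


def synthetic_scan(num_beams, points_per_beam, fov_down_deg, fov_up_deg, radius=10.0):
    """Build a toy scan: every beam is a ring of points at a fixed range."""
    points = []
    step = (fov_up_deg - fov_down_deg) / (num_beams - 1)
    for b in range(num_beams):
        elev = math.radians(fov_down_deg + b * step)
        for a in range(points_per_beam):
            azim = 2.0 * math.pi * a / points_per_beam
            points.append((
                radius * math.cos(elev) * math.cos(azim),
                radius * math.cos(elev) * math.sin(azim),
                radius * math.sin(elev),
            ))
    return points


if __name__ == "__main__":
    # Hypothetical 64-beam scan reduced to a 32-beam pseudo-LiDAR.
    scan = synthetic_scan(64, 8, -25.0, 3.0)
    low_beam = downsample_beams(scan, 64, 2, -25.0, 3.0)
    print(len(scan), len(low_beam))
```

Real sensors rarely have perfectly uniform beam spacing, so a practical version would more likely use the per-point ring index when the dataset provides one, or cluster elevation angles instead of binning them linearly.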

Road-aware Monocular Structure from Motion and Homography Estimation

Wei Sui , Teng Chen , Jiaxin Zhang , Jiao Lu , Qian Zhang

Category: Computer Vision

2021-12-16

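Structure from motion (SFM) and ground plane homography estimation are critical to autonomous driving and other robotics applications. Recently, much progress has been made in using deep neural networks for SFM and for homography estimation separately. However, directly applying existing methods to ground plane homography estimation may fail, because the road is often only a small part of the scene. In addition, the performance of deep SFM approaches is still inferior to that of traditional methods. In this paper, we propose a method that learns to solve both problems in an end-to-end manner, improving performance on both. The proposed network consists of a Depth-CNN, a Pose-CNN, and a Ground-CNN. The Depth-CNN and Pose-CNN estimate a dense depth map and the ego-motion respectively, solving SFM, while the Pose-CNN and Ground-CNN, followed by a homography layer, solve the ground plane estimation problem. By enforcing consistency between the SFM and homography estimation results, the whole network can be trained end to end using a photometric loss and a homography loss, with no supervision other than the road segmentation provided by an off-the-shelf segmenter. Comprehensive experiments on the KITTI benchmark show promising results compared with various state-of-the-art approaches.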
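
The link between ego-motion, the ground plane, and a homography can be illustrated with the standard plane-induced homography. The sketch below is not the paper's homography layer; it only shows the underlying geometry under stated assumptions: a point moves between camera frames as X2 = R X1 + t, the ground plane satisfies n^T X1 = d in the first frame, and the resulting homography is H = K (R + t n^T / d) K^-1. All function names and numeric values are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(a, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]


def intrinsics(fx, fy, cx, cy):
    """Return a pinhole intrinsic matrix K and its closed-form inverse."""
    k = [[fx, 0.0, cx], [0.0, fy, cy], [0.0, 0.0, 1.0]]
    k_inv = [[1.0 / fx, 0.0, -cx / fx], [0.0, 1.0 / fy, -cy / fy], [0.0, 0.0, 1.0]]
    return k, k_inv


def plane_homography(k, k_inv, rot, trans, normal, dist):
    """H = K (R + t n^T / d) K^-1, for X2 = R X1 + t and plane n^T X1 = d."""
    m = [[rot[i][j] + trans[i] * normal[j] / dist for j in range(3)]
         for i in range(3)]
    return matmul(matmul(k, m), k_inv)


def project(k, point):
    """Project a 3D point in camera coordinates to pixel coordinates."""
    u, v, w = matvec(k, point)
    return (u / w, v / w)


def warp(h, pixel):
    """Apply a homography to a pixel."""
    u, v, w = matvec(h, [pixel[0], pixel[1], 1.0])
    return (u / w, v / w)


if __name__ == "__main__":
    # Hypothetical setup: x right, y down, z forward; camera 1.5 m above the road.
    k, k_inv = intrinsics(700.0, 700.0, 640.0, 360.0)
    yaw = math.radians(2.0)
    rot = [[math.cos(yaw), 0.0, math.sin(yaw)],
           [0.0, 1.0, 0.0],
           [-math.sin(yaw), 0.0, math.cos(yaw)]]
    trans = [0.0, 0.0, -1.0]            # the camera moved 1 m forward
    normal, dist = [0.0, 1.0, 0.0], 1.5  # ground plane y = 1.5 in frame 1
    h = plane_homography(k, k_inv, rot, trans, normal, dist)

    x1 = [0.5, 1.5, 8.0]                 # a road point seen in frame 1
    x2 = [a + b for a, b in zip(matvec(rot, x1), trans)]
    print(warp(h, project(k, x1)), project(k, x2))
```

The two printed pixel positions should agree for any point on the plane. Other sign conventions for the plane equation or the motion direction give the more commonly quoted form with a minus sign, so the convention needs to be fixed before comparing against another implementation.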

Understanding and Predicting the Memorability of Outdoor Natural Scenes

Jiaxin Lu , Mai Xu , Ren Yang , Zulin Wang

Category: Computer Vision

2018-10-09

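Memorability measures how easily an image is remembered after a glance, which can help in designing magazine covers, tourism publicity materials, and so forth. Recent works have shed light on the visual features that make generic images, object images, or face photographs memorable. However, these methods cannot effectively predict the memorability of outdoor natural scene images. To overcome this shortcoming of previous works, in this paper we attempt to answer the question: "What exactly makes an outdoor natural scene memorable?" To this end, we first establish a large-scale outdoor natural scene image memorability (LNSIM) database, containing 2,632 outdoor natural scene images with their ground-truth memorability scores and multi-label scene category annotations. Then, similar to previous works, we mine our database to investigate how low-, middle-, and high-level handcrafted features affect the memorability of outdoor natural scenes. In particular, we find that the high-level feature of scene category is rather strongly correlated with outdoor natural scene memorability, and that the deep features learned by a deep neural network (DNN) are also effective in predicting memorability scores. Moreover, combining the deep features with the category feature can further boost the performance of memorability prediction. We therefore propose an end-to-end DNN-based outdoor natural scene memorability (DeepNSM) predictor, which takes advantage of the learned category-related features. The experimental results validate the effectiveness of our DeepNSM model, which outperforms the state-of-the-art methods. Finally, we try to understand the reasons for the good performance of our DeepNSM model, and we study the cases in which it succeeds or fails to accurately predict the memorability of outdoor natural scenes. Code: github.com/jiaxinlu-home/natural-cene-memorability-dataset.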

POTATO: The Portable Text Annotation Tool

Jiaxin Pei , Aparna Ananthasubramaniam , Xingyao Wang , Naitian Zhou , Jackson Sargent , Apostolos Dedeloudis , David Jurgens

Categories: Natural Language Processing | Artificial Intelligence | Machine Learning

2022-12-16

We present POTATO, the Portable Text Annotation Tool, a free, fully open-sourced annotation system that 1) supports labeling many types of text and multimodal data; 2) offers easy-to-configure features to maximize the productivity of both deployers and annotators (convenient templates for common ML/NLP tasks, active learning, keypress shortcuts, keyword highlights, tooltips); and 3) supports a high degree of customization (editable UI, inserting pre-screening questions, attention and qualification tests). Experiments over two annotation tasks suggest that POTATO improves labeling speed through its specially designed productivity features, especially for long documents and complex tasks. POTATO is available at https://github.com/davidjurgens/potato and will continue to be updated.

MOPRD: A multidisciplinary open peer review dataset

Jialiang Lin , Jiaxin Song , Zhangping Zhou , Yidong Chen , Xiaodong Shi

Categories: Artificial Intelligence | Natural Language Processing | Machine Learning

2022-12-09

Open peer review is a growing trend in academic publications. Public access to peer review data can benefit both the academic and publishing communities. It also greatly supports studies on review comment generation and, further, the realization of automated scholarly paper review. However, most existing peer review datasets do not provide data that cover the whole peer review process. In addition, their data are not diverse enough, as they are mainly collected from the field of computer science. These two drawbacks of the currently available peer review datasets need to be addressed to unlock more opportunities for related studies. In response, we construct MOPRD, a multidisciplinary open peer review dataset. This dataset consists of paper metadata, multiple manuscript versions, review comments, meta-reviews, authors' rebuttal letters, and editorial decisions. Moreover, we design a modular guided review comment generation method based on MOPRD. Experiments show that our method delivers better performance, as indicated by both automatic metrics and human evaluation. We also explore other potential applications of MOPRD, including meta-review generation, editorial decision prediction, author rebuttal generation, and scientometric analysis. MOPRD is a strong endorsement for further studies in peer review-related research and other applications.

Towards Accurate Ground Plane Normal Estimation from Ego-Motion

Jiaxin Zhang , Wei Sui , Qian Zhang , Tao Chen , Cong Yang

Categories: Computer Vision | Robotics

2022-12-08

In this paper, we introduce a novel approach for ground plane normal estimation of wheeled vehicles. In practice, the ground plane changes dynamically due to braking and unstable road surfaces. As a result, the vehicle pose, especially the pitch angle, oscillates, sometimes subtly and sometimes noticeably. Estimating the ground plane normal is therefore meaningful, since it can be encoded to improve the robustness of various autonomous driving tasks (e.g., 3D object detection, road surface reconstruction, and trajectory planning). Our proposed method uses only odometry as input and estimates accurate ground plane normal vectors in real time. In particular, it fully utilizes the underlying connection between the ego pose odometry (ego-motion) and the nearby ground plane. Building on that, an Invariant Extended Kalman Filter (IEKF) is designed to estimate the normal vector in the sensor's coordinate frame. Our proposed method is thus simple yet efficient, and it supports both camera- and inertial-based odometry algorithms. Its usability and the marked improvement in robustness are validated through multiple experiments on public datasets. For instance, we achieve state-of-the-art accuracy on the KITTI dataset, with an estimated vector error of 0.39°. Our code is available at github.com/manymuch/ground_normal_filter.

G-MAP: General Memory-Augmented Pre-trained Language Model for Domain Tasks

Zhongwei Wan , Yichun Yin , Wei Zhang , Jiaxin Shi , Lifeng Shang , Guangyong Chen , Xin Jiang , Qun Liu

Category: Natural Language Processing

2022-12-07

Recently, domain-specific PLMs have been proposed to boost the task performance of specific domains (e.g., biomedical and computer science) by continuing to pre-train general PLMs with domain-specific corpora. However, this Domain-Adaptive Pre-Training (DAPT; Gururangan et al. (2020)) tends to forget the previous general knowledge acquired by general PLMs, which leads to a catastrophic forgetting phenomenon and sub-optimal performance. To alleviate this problem, we propose a new framework of General Memory Augmented Pre-trained Language Model (G-MAP), which augments the domain-specific PLM by a memory representation built from the frozen general PLM without losing any general knowledge. Specifically, we propose a new memory-augmented layer, and based on it, different augmented strategies are explored to build the memory representation and then adaptively fuse it into the domain-specific PLM. We demonstrate the effectiveness of G-MAP on various domains (biomedical and computer science publications, news, and reviews) and different kinds (text classification, QA, NER) of tasks, and the extensive results show that the proposed G-MAP can achieve SOTA results on all tasks.

Accelerating Inverse Learning via Intelligent Localization with Exploratory Sampling

Jiaxin Zhang , Sirui Bi , Victor Fung

Categories: Machine Learning | Artificial Intelligence

2022-12-02

In the scope of "AI for Science", solving inverse problems is a longstanding challenge in materials and drug discovery, where the goal is to determine the hidden structures given a set of desirable properties. Deep generative models have recently been proposed to solve inverse problems, but they currently use expensive forward operators and struggle to precisely localize the exact solutions and to fully explore the parameter space without missing solutions. In this work, we propose a novel approach (called iPage) to accelerate the inverse learning process by leveraging probabilistic inference from deep invertible models and deterministic optimization via fast gradient descent. Given a target property, the learned invertible model provides a posterior over the parameter space; we use these posterior samples as an intelligent prior initialization, which enables us to narrow down the search space. We then perform gradient descent to calibrate the inverse solutions within a local region. Meanwhile, space-filling sampling is imposed on the latent space to better explore and capture all possible solutions. We evaluate our approach on three benchmark tasks and two created datasets with real-world applications from quantum chemistry and additive manufacturing, and find that our method achieves superior performance compared to several state-of-the-art baseline methods. The iPage code is available at https://github.com/jxzhangjhu/MatDesINNe.
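
The "initialize from samples, then refine by gradient descent" idea can be sketched on a toy problem. This is not the iPage code: the invertible model's posterior samples are replaced here by hand-made candidates (evenly spaced angles with random radii), the forward operator is a hypothetical two-parameter function, and all names and hyperparameters are assumptions made for illustration.

```python
import math
import random


def forward(x):
    """Toy forward operator: maps 2-D parameters to one scalar property."""
    return x[0] ** 2 + x[1] ** 2


def loss_grad(x, target):
    """Gradient of (forward(x) - target)^2 with respect to x."""
    residual = forward(x) - target
    return [4.0 * residual * x[0], 4.0 * residual * x[1]]


def initial_candidates(n, seed=0):
    """Stand-in for posterior samples: rough guesses spread over the space.

    Assumption: a learned invertible model would supply these in practice.
    """
    rng = random.Random(seed)
    candidates = []
    for i in range(n):
        angle = 2.0 * math.pi * i / n
        radius = rng.uniform(0.5, 1.5)
        candidates.append([radius * math.cos(angle), radius * math.sin(angle)])
    return candidates


def localize(x, target, lr=0.05, steps=200):
    """Refine one candidate with plain gradient descent on the squared error."""
    x = list(x)
    for _ in range(steps):
        g = loss_grad(x, target)
        x = [x[0] - lr * g[0], x[1] - lr * g[1]]
    return x


def solve_inverse(target, n=8):
    """Refine every candidate, keeping one local solution per starting point."""
    return [localize(x, target) for x in initial_candidates(n)]


if __name__ == "__main__":
    for solution in solve_inverse(1.0):
        print(solution, forward(solution))
```

Because the toy problem has a whole circle of valid parameters, each starting point settles on a different solution, which is the reason for spreading the initial candidates rather than refining a single guess.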

AutoCAD: Automatically Generating Counterfactuals for Mitigating Shortcut Learning

Jiaxin Wen , Yeshuang Zhu , Jinchao Zhang , Jie Zhou , Minlie Huang

Categories: Artificial Intelligence | Natural Language Processing

2022-11-29

Recent studies have shown the impressive efficacy of counterfactually augmented data (CAD) for reducing NLU models' reliance on spurious features and improving their generalizability. However, current methods still rely heavily on human effort or task-specific designs to generate counterfactuals, which impedes CAD's applicability to a broad range of NLU tasks. In this paper, we present AutoCAD, a fully automatic and task-agnostic CAD generation framework. AutoCAD first leverages a classifier to identify, without supervision, rationales as the spans to intervene on, which disentangles spurious and causal features. Then, AutoCAD performs controllable generation enhanced by unlikelihood training to produce diverse counterfactuals. Extensive evaluations on multiple out-of-domain and challenge benchmarks demonstrate that AutoCAD consistently and significantly boosts the out-of-distribution performance of powerful pre-trained models across different NLU tasks, with results comparable to or even better than those of previous state-of-the-art human-in-the-loop or task-specific CAD methods. The code is publicly available at https://github.com/thu-coai/AutoCAD.
